195 research outputs found

The Mirror DBMS at TREC-8

Author: Hiemstra Djoerd
Vries Arjen P. de
Publication venue: National Institute of Standards and Technology (NIST)
Publication date: 01/01/1999

The database group at the University of Twente participates in TREC-8 using the Mirror DBMS, a prototype database system designed especially for multimedia and web retrieval. From a database perspective, the purpose has been to check whether we can get sufficient performance, and to prepare for the very large corpus track in which we plan to participate next year. From an IR perspective, the experiments have been designed to learn more about the effect of global statistics on the ranking.

CiteSeerX

CWI's Institutional Repository

Radboud Repository

University of Twente Research Information

The SIKS/BiGGrid Big Data Tutorial

Author: Hiemstra Djoerd
Lammerts Evert
Vries Arjen P. de
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2011

The School for Information and Knowledge Systems (SIKS) and the Dutch e-science grid BiG Grid organized a new two-day tutorial on Big Data at the University of Twente on 30 November and 1 December 2011, just preceding the Dutch-Belgian Database Day. The tutorial covers recent developments in large-scale data processing and data centers, initiated by Google and followed by many others such as Yahoo, Amazon, Microsoft, and Facebook. The course teaches how to process terabytes of data on large clusters, and discusses several core computer science topics adapted for big data, such as new file systems (Google File System and Hadoop FS), new programming paradigms (MapReduce), new programming languages and query languages (Sawzall, Pig Latin), and new 'noSQL' databases (BigTable, Cassandra and Dynamo).

Radboud Repository

University of Twente Research Information

Runtime Optimizations for Prediction with Tree-Based Models

Author: Asadi Nima
de Vries Arjen P.
Lin Jimmy
Publication date: 01/01/2013

Tree-based models have proven to be an effective solution for web ranking as well as other problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, given an already-trained model. Although exceedingly simple conceptually, most implementations of tree-based models do not efficiently utilize modern superscalar processor architectures. By laying out data structures in memory in a more cache-conscious fashion, removing branches from the execution flow using a technique called predication, and micro-batching predictions using a technique called vectorization, we are able to better exploit modern processor architectures and significantly improve the speed of tree-based models over hard-coded if-else blocks. Our work contributes to the exploration of architecture-conscious runtime implementations of machine learning algorithms.
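
The three ideas named in this abstract (array-based layout, predication, and micro-batching) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy tree, the array names, and all function names below are hypothetical, and plain Python will not show the hardware-level speedups the paper reports. It only shows the shape of the control flow.

```python
# Hypothetical depth-2 tree, flattened into parallel arrays (illustrative only).
# Node i tests x[FEATURE[i]] > THRESHOLD[i]; LEFT/RIGHT hold child indices.
# Leaves (nodes 3..6) point to themselves and carry the prediction in VALUE.
FEATURE = [0, 1, 1, 0, 0, 0, 0]
THRESHOLD = [0.5, 0.3, 0.7, 0.0, 0.0, 0.0, 0.0]
LEFT = [1, 3, 5, 3, 4, 5, 6]
RIGHT = [2, 4, 6, 3, 4, 5, 6]
VALUE = [0.0, 0.0, 0.0, 10.0, 20.0, 30.0, 40.0]
DEPTH = 2


def predict_if_else(x):
    """Baseline: the same tree hard-coded as nested if-else blocks."""
    if x[0] > 0.5:
        if x[1] > 0.7:
            return 40.0
        return 30.0
    if x[1] > 0.3:
        return 20.0
    return 10.0


def predict_predicated(x):
    """Predication-style traversal: the comparison result is used as a
    number to compute the next node index, instead of selecting a branch."""
    node = 0
    for _ in range(DEPTH):
        go_right = int(x[FEATURE[node]] > THRESHOLD[node])  # 0 or 1
        node = LEFT[node] + go_right * (RIGHT[node] - LEFT[node])
    return VALUE[node]


def predict_batch(rows):
    """Micro-batching: advance every instance in the batch by one tree
    level before moving on to the next level."""
    nodes = [0] * len(rows)
    for _ in range(DEPTH):
        nodes = [
            LEFT[n] + int(x[FEATURE[n]] > THRESHOLD[n]) * (RIGHT[n] - LEFT[n])
            for n, x in zip(nodes, rows)
        ]
    return [VALUE[n] for n in nodes]


if __name__ == "__main__":
    batch = [[0.2, 0.1], [0.2, 0.5], [0.9, 0.1], [0.9, 0.8]]
    print(predict_batch(batch))
```

In a compiled language the arithmetic update of `node` lets the processor avoid branch mispredictions, and the level-by-level batch loop gives it independent memory accesses to overlap; the sketch assumes a complete tree of fixed depth so that no leaf check is needed inside the loop.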

arXiv.org e-Print Archive

CWI's Institutional Repository

The role of evaluation in the development of content-based retrieval techniques

Author: Vries Arjen P. de
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/2000

CWI's Institutional Repository

University of Twente Research Information

Multimedia search without visual analysis: the value of linguistic and contextual information

Author: Jong Franciska M.G. de
Vries Arjen P. de
Westerveld Thijs
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2007

This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features.

CiteSeerX

CWI's Institutional Repository

University of Twente Research Information

Better contextual suggestions in ClueWeb12 using domain knowledge inferred from the open web

Author: Bellogín Alejandro
Samar Thaer
Vries Arjen P. de
Publication venue: National Institute of Standards and Technology (NIST)
Publication date: 01/01/2014

Proceedings of the 23rd Text Retrieval Conference (TREC 2014), held in Gaithersburg, Maryland, USA, in 2014. This paper provides an overview of our participation in the Contextual Suggestion Track. The TREC 2014 Contextual Suggestion Track allowed participants to submit personalized rankings using documents either from the Open Web or from an archived, static Web collection, the ClueWeb12 dataset. In this paper, we focus on filtering the entire ClueWeb12 collection to exploit domain knowledge from touristic sites available on the Open Web. We show that the recommendations generated for the provided user profiles and contexts improve significantly using this inferred domain knowledge. This research was supported by the Netherlands Organization for Scientific Research (NWO project #640.005.001).

Biblos-e Archivo